
    How Much Speech Data Is Needed for Tracking Language Change in Alzheimer’s Disease? A Comparison of Random Length, 5-Min, and 1-Min Spontaneous Speech Samples

    Introduction: Changes in speech can act as biomarkers of cognitive decline in Alzheimer's disease (AD). While shorter speech samples would facilitate data collection and analysis, the minimum length of informative speech samples remains debated. This study aims to provide insight into the effect of sample length when analyzing longitudinal recordings of spontaneous speech in AD by comparing the original random-length samples with 5- and 1-minute-long samples. We ask whether capping the audio improves the accuracy of the analysis and whether the extra 4 minutes convey necessary information.
    Methods: 110 spontaneous speech samples were collected from decades of YouTube videos of 17 public figures, 9 of whom eventually developed AD. 456 language features were extracted, and their text-length sensitivity, comparability, and ability to capture change over time were analyzed across the three sample lengths.
    Results: Capped audio files had advantages over the random-length ones. While most extracted features were statistically comparable or highly correlated across the datasets, potential effects of sample length should be acknowledged for some features. The 5-minute dataset was the most reliable for tracking the evolution of the disease, suggesting that the 4 extra minutes do convey informative data.
    Conclusion: Sample length appears to play an important role in extracting language feature values from speech and in tracking disease progress over time. We highlight the importance of further research into optimal sample length and of standardizing methods when studying speech in AD.
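    As a rough illustration of the length comparison described above, the sketch below caps hypothetical word-timestamped transcripts at 5 and 1 minutes, computes one illustrative length-sensitive feature (type-token ratio), and correlates the capped values with the uncapped ones. The transcript format, the feature choice, and the correlation measure are assumptions for illustration; the study's actual 456 features and pipeline are not reproduced here.

```python
# Hypothetical sketch: compare one lexical feature across capped sample lengths.
# Assumes word-timestamped transcripts as lists of {"token", "start"} dicts;
# this is not the paper's pipeline, only an illustration of the comparison.
from scipy.stats import spearmanr

def cap_transcript(words, max_seconds):
    """Keep only words whose start time falls within the first max_seconds."""
    if max_seconds is None:          # random/original length: no capping
        return [w["token"] for w in words]
    return [w["token"] for w in words if w["start"] < max_seconds]

def type_token_ratio(tokens):
    """One illustrative lexical-diversity feature (known to be length-sensitive)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def compare_lengths(samples):
    """Compute the feature at three lengths per sample and report how strongly
    the capped values correlate with the uncapped (random-length) values."""
    full = [type_token_ratio(cap_transcript(s, None)) for s in samples]
    five = [type_token_ratio(cap_transcript(s, 300)) for s in samples]
    one = [type_token_ratio(cap_transcript(s, 60)) for s in samples]
    rho5, _ = spearmanr(full, five)
    rho1, _ = spearmanr(full, one)
    return {"rho_5min_vs_full": rho5, "rho_1min_vs_full": rho1}
```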

    Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity

    We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering data sets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili). Each language data set is annotated for the lexical relation of semantic similarity and contains 1,888 semantically aligned concept pairs, providing representative coverage of word classes (nouns, verbs, adjectives, adverbs), frequency ranks, similarity intervals, lexical fields, and concreteness levels. Additionally, owing to the alignment of concepts across languages, we provide a suite of 66 cross-lingual semantic similarity data sets. Due to its extensive size and language coverage, Multi-SimLex provides entirely novel opportunities for experimental evaluation and analysis. On its monolingual and cross-lingual benchmarks, we evaluate and analyze a wide array of recent state-of-the-art monolingual and cross-lingual representation models, including static and contextualized word embeddings (such as fastText, monolingual and multilingual BERT, and XLM), externally informed lexical representations, and fully unsupervised as well as (weakly) supervised cross-lingual word embeddings. We also present a step-by-step protocol for creating consistent, Multi-SimLex-style resources for additional languages. We make these contributions available via a website: the public release of the Multi-SimLex data sets, their creation protocol, strong baseline results, and in-depth analyses that can help guide future developments in multilingual lexical semantics and representation learning. The website is intended to encourage a community effort to extend Multi-SimLex to many more languages, and such a large-scale semantic resource could inspire significant further advances in NLP across languages.
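    Benchmarks in the SimLex family are typically scored by correlating model-predicted similarities with the human ratings. The sketch below illustrates that protocol for static word vectors, using cosine similarity and Spearman's rho; the file names and the tab-separated pair format are assumptions for illustration, not the released Multi-SimLex layout.

```python
# Minimal sketch of a SimLex-style evaluation: cosine similarity of static
# word vectors vs. human similarity ratings, scored with Spearman's rho.
# File names and the (word1, word2, score) TSV layout are assumptions.
import numpy as np
from scipy.stats import spearmanr

def load_vectors(path, limit=200_000):
    """Load word vectors from a fastText-style .vec text file."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        next(f)  # skip the header line (vocabulary size, dimension)
        for i, line in enumerate(f):
            if i >= limit:
                break
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def evaluate(pairs_path, vectors):
    """Spearman correlation between model similarities and human scores,
    skipping out-of-vocabulary pairs (coverage should also be reported)."""
    model_sims, human_sims = [], []
    with open(pairs_path, encoding="utf-8") as f:
        for line in f:
            w1, w2, score = line.rstrip().split("\t")
            if w1 in vectors and w2 in vectors:
                model_sims.append(cosine(vectors[w1], vectors[w2]))
                human_sims.append(float(score))
    rho, _ = spearmanr(model_sims, human_sims)
    return rho
```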